Part 2: Secure Model Serving with Syft Keras

Now that you have a model trained with normal Keras, you are ready to serve some private predictions. We can do that using Syft Keras.

To secure and serve this model, we will need three TFEWorkers (servers). This is because TF Encrypted under the hood uses an encryption technique called multi-party computation (MPC). The idea is to split the model weights and input data into shares, then send a share of each value to the different servers. The key property is that if you look at the share on one server, it reveals nothing about the original value (input data or model weights).
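The splitting described above is additive secret sharing. Here is a minimal, self-contained sketch of the idea (the names `share`, `reconstruct`, and `MODULUS` are illustrative and not part of the Syft or TF Encrypted API):

```python
import numpy as np

MODULUS = 2**32  # shares live in a fixed ring of integers

def share(secret, num_shares=3):
    """Split an integer secret into additive shares that sum to it mod MODULUS."""
    rng = np.random.default_rng()
    shares = [int(rng.integers(0, MODULUS)) for _ in range(num_shares - 1)]
    # the last share is chosen so the total reconstructs the secret
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    return sum(shares) % MODULUS

shares = share(42)  # one share would go to each of the three TFEWorkers
assert reconstruct(shares) == 42
```

Each individual share is a uniformly random ring element, so a single server learns nothing about the original value; only the sum of all shares recovers it.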

We'll define a Syft Keras model like we did in the previous notebook. However, there is a trick: before instantiating this model, we'll run hook = sy.KerasHook(tf.keras). This will add three important new methods to the Keras Sequential class:

  • share: will secure your model via secret sharing; by default, it will use the SecureNN protocol from TF Encrypted to secret share your model among the three TFEWorkers. Most importantly, this adds the capability of providing predictions on encrypted data.
  • serve: this function will launch a serving queue, so that the TFEWorkers can accept prediction requests on the secured model from external clients.
  • shutdown_workers: once you are done providing private predictions, you can shut down your model by running this function. If you've opted to manage each worker manually, it will direct you to shut down the server processes yourself.

If you want to learn more about MPC, you can read this excellent blog.


In [ ]:
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Activation, Flatten

import syft as sy
hook = sy.KerasHook(tf.keras)

Model

As you can see, we define almost the exact same model as before, except we provide a batch_input_shape. This allows TF Encrypted to better optimize the secure computations via predefined tensor shapes. For this MNIST demo, we'll send input data with the shape of (1, 28, 28, 1). We also return raw logits instead of applying softmax, because softmax is expensive to compute under MPC and isn't needed for serving prediction requests.


In [ ]:
num_classes = 10
input_shape = (1, 28, 28, 1)

In [ ]:
model = Sequential()

model.add(Conv2D(10, (3, 3), batch_input_shape=input_shape))
model.add(AveragePooling2D((2, 2)))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(AveragePooling2D((2, 2)))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(AveragePooling2D((2, 2)))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(num_classes, name="logit"))
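Since the model ends at the logit layer, a client receiving the served output can recover class probabilities locally with an ordinary softmax. A minimal NumPy sketch (illustrative, not part of the Syft API):

```python
import numpy as np

def softmax(logits):
    # shift by the max logit for numerical stability before exponentiating
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1]])  # shape (1, num_classes), like the served output
probs = softmax(logits)               # probabilities sum to 1; argmax is unchanged
```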

Load pre-trained weights

With load_weights you can easily load the weights you have saved previously after training your model.


In [ ]:
pre_trained_weights = 'short-conv-mnist.h5'
model.load_weights(pre_trained_weights)

Launch the workers

Let's now create the TFEWorkers (alice, bob, and carol) required by TF Encrypted to perform private predictions. For each TFEWorker, you just have to specify a host. We then combine these workers into a cluster.

These workers run a TensorFlow server, which you can either manage manually (AUTO = False) or ask the workers to manage for you (AUTO = True). If you choose to manage them manually, you will be instructed to execute a terminal command on each worker's host device after calling cluster.start() below. If all workers are hosted on a single device (e.g. localhost), you can have Syft manage each worker's TensorFlow server automatically.


In [ ]:
AUTO = False

alice = sy.TFEWorker(host='localhost:4000', auto_managed=AUTO)
bob = sy.TFEWorker(host='localhost:4001', auto_managed=AUTO)
carol = sy.TFEWorker(host='localhost:4002', auto_managed=AUTO)

cluster = sy.TFECluster(alice, bob, carol)
cluster.start()

Secure the model by sharing the weights

Thanks to sy.KerasHook(tf.keras) you can call the share method to transform your model into a TF Encrypted Keras model.

If you asked to manage the servers manually above, this step will not complete until they have all been launched. Note that your firewall may ask whether Python can accept incoming connections.


In [ ]:
model.share(cluster)

Serve model

Perfect! Now, by calling model.serve, your model is ready to provide private predictions. You can set num_requests to limit the number of prediction requests the model will serve; if not specified, the model will be served until interrupted.


In [ ]:
model.serve(num_requests=3)

You are ready to move to the Part 13c notebook to request some private predictions.

Cleanup!

Once your request limit above is reached, the model will no longer be available for serving requests, but it is still secret shared among the three workers. You can kill the workers by executing the cell below.

Congratulations on finishing Part 13b: Secure Classification with Syft Keras and TFE!


In [ ]:
model.stop()
cluster.stop()

if not AUTO:
    process_ids = !ps aux | grep '[p]ython -m tf_encrypted.player --config' | awk '{print $2}'
    for process_id in process_ids:
        !kill {process_id}
        print("Process ID {id} has been killed.".format(id=process_id))
